Analysing and Classifying Names of Chemical Compounds with CHEMorph

Authors

  • Gerhard Kremer
  • Stefanie Anstein
  • Uwe Reyle
Abstract

We present a prototype system that uses a purely linguistic method to analyse names of organic chemical compounds. It analyses compound names morpho-semantically, generates line-based, machine-readable representations of the corresponding molecular structures (SMILES strings), and triggers a taxonomic classification. CHEMorph is intended to support manual database curation and to serve as a basis for biochemical text processing. The system is written in Prolog.
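
The three steps named in the abstract (morphological segmentation, SMILES generation, classification) can be illustrated with a deliberately tiny sketch. It is not the CHEMorph grammar and is written in Python rather than Prolog purely for illustration. The stem table, the suffix table, the function name `analyse`, and the restriction to unbranched alkanes and alcohols with the hydroxyl group on a terminal carbon are all assumptions made here, not details taken from the paper.

```python
# Toy illustration only: NOT the CHEMorph grammar (which is written in Prolog).
# Assumption: only unbranched names of the form <stem><suffix> are handled,
# and an "-anol" name is read as having its OH group on a terminal carbon.

# Hypothetical stem table: morpheme -> number of carbon atoms in the chain.
STEMS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5, "hex": 6}

# Hypothetical suffix table: morpheme -> (class label, SMILES tail).
SUFFIXES = {"ane": ("alkane", ""), "anol": ("alcohol", "O")}


def analyse(name):
    """Split a simple compound name into morphemes, build a SMILES string,
    and assign a coarse class label."""
    name = name.strip().lower()
    for stem, carbons in STEMS.items():
        if name.startswith(stem):
            suffix = name[len(stem):]
            if suffix in SUFFIXES:
                label, tail = SUFFIXES[suffix]
                return {
                    "morphemes": [stem, suffix],
                    "smiles": "C" * carbons + tail,
                    "class": label,
                }
    raise ValueError("name not covered by this toy grammar: " + name)


if __name__ == "__main__":
    for compound in ("methane", "ethanol", "butane"):
        print(compound, analyse(compound))
```

A real analyser has to handle locants, branching, rings, multiple functional groups, and underspecified names, none of which this sketch attempts.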

Similar resources

Analysing Names of Organic Chemical Compounds -- From Morpho-Semantics to SMILES Strings and Classes (Web Version)

The linguistic analysis of chemical terminology is a key to biochemical text processing and semi-automatic database curation. The system described analyses systematic and semi-systematic names of chemical compounds, class terms, and also otherwise underspecified names by means of a morpho-semantic grammar developed according to IUPAC nomenclature. It yields an intermediate semantic representati...

A System for Identifying and Classifying Names in Persian Texts

Named entity recognition (NER) refers to systems that identify one or more kinds of names in a text and classify them into specified categories. These categories can be names of people, organizations, companies, or places (country, city, street, etc.), time expressions (date and time), financial values, percentages, etc. Although during the past decade a lot of research has been done on NER in ...

Identifying and Classifying Terms in the Life Sciences: The Case of Chemical Terminology

Facing the huge amount of textual and terminological data in the life sciences, we present a theoretical basis for the linguistic analysis of chemical terms. Starting with organic compound names, we conduct a morpho-semantic deconstruction into morphemes and yield a semantic representation of the terms’ functional and structural properties. These semantic representations imply both the molecula...

Phenolic compounds as chemical markers of low taxonomic levels in the marine algal genus Laurencia in the Persian Gulf

The genus Laurencia (Rhodomelaceae), a complex group, has 285 species and infraspecific names. Identification and taxonomy of these taxa have mainly been based on flexible morphological characters, which has led to a complicated taxonomy in this group. Nowadays, the taxonomic study of this group has changed considerably through the use of reproductive characters, anatomical differences and modern...

CADMIUM AND CADMIUM COMPOUNDS

Synonyms, trade names and molecular formulae for cadmium, cadmium-copper alloy and some cadmium compounds are presented in Table 1. The cadmium compounds shown are those for which data on carcinogenicity or mutagenicity were available or which are commercially important compounds. It is not an exhaustive list and does not necessarily include all of the most commercially important cadmium-contai...

Publication date: 2006